1 Introduction


Nonlinear autoregressive models are useful for modeling many natural processes; however,
the class of such models is very large. Functional-coefficient autoregressive (FCAR)
models provide a useful structure for reducing the size of this class. The FCAR
model is defined as


$$X_t = \sum_{\alpha=1}^{p} m_\alpha(U_t)\, X_{t-\alpha} + \sigma(\mathbf{X}_t)\,\varepsilon_t, \qquad t = 1, \ldots, n, \tag{1}$$


where $p$ and $d$ are positive integers, $m_\alpha(U_t)$ is a measurable function of the delay variable
$U_t = X_{t-d}$ for $\alpha = 1, \ldots, p$, $\sigma^2(\mathbf{X}_t)$ is a variance function dependent on $\mathbf{X}_t = (X_1, \ldots, X_n)'$,
and $\{\varepsilon_t\}$ is a sequence of i.i.d. random variables with mean 0 and variance $\sigma^2$. Although this
structure reduces the class of nonlinear models, it is broad enough to include some common
time series models as special cases. Among these are the threshold autoregressive (TAR) model
of Tong (1983), the exponential autoregressive (EXPAR) model of Haggan and Ozaki (1981),
and the smooth transition autoregressive (STAR) model of Chan and Tong (1986).
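To make the structure of (1) concrete, the following minimal Python sketch simulates a series from an FCAR recursion. It is an illustration only: the function name `simulate_fcar`, the constant noise scale `sigma` (a simplification of the variance function in (1)), the burn-in length, and the EXPAR-type coefficient functions are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def simulate_fcar(n, coef_funcs, d, sigma=0.1, burn_in=200, seed=0):
    """Simulate X_t = sum_a m_a(X_{t-d}) * X_{t-a} + sigma * eps_t.

    coef_funcs : list of p callables m_1, ..., m_p (illustrative choices)
    d          : delay, so that U_t = X_{t-d}
    sigma      : constant noise scale (assumed; (1) allows sigma to depend on X_t)
    """
    rng = np.random.default_rng(seed)
    p = len(coef_funcs)
    start = max(p, d)                 # lags needed before the recursion can run
    total = start + burn_in + n
    x = np.zeros(total)
    x[:start] = rng.normal(scale=sigma, size=start)   # arbitrary initial values
    for t in range(start, total):
        u = x[t - d]                  # delay variable U_t = X_{t-d}
        ar_part = sum(m(u) * x[t - a] for a, m in enumerate(coef_funcs, start=1))
        x[t] = ar_part + sigma * rng.normal()
    return x[-n:]                     # drop initial values and burn-in

# EXPAR-type example: both coefficients vary smoothly with U_t = X_{t-1}
expar_coefs = [lambda u: 0.3 + 0.4 * np.exp(-u ** 2),
               lambda u: -0.2 + 0.1 * np.exp(-u ** 2)]
series = simulate_fcar(n=500, coef_funcs=expar_coefs, d=1)
```

Replacing the smooth functions with step functions of `u` would give a TAR-type special case of the same recursion.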


Chen and Tsay (1993) introduced the FCAR model and proposed a procedure for building
the model based on arranged local regression, which constructs estimators using an iterative
recursive formula that resembles local constant smoothing. Cai, Fan and Yao (2000) used a local
linear fitting method to estimate the coefficient functions. They applied the method to simulated
data from an EXPAR model and assessed the fit by calculating the square root of the average
squared errors (RASE). Huang and Shen (2004) proposed a global smoothing procedure based on
polynomial splines for estimating FCAR models. The authors note that the spline
method yields a fitted model with a parsimonious explicit expression, which is an advantage over
the local polynomial method. This feature allows one to produce multi-step-ahead forecasts
conveniently. Additionally, their spline method is less computationally intensive than the local
polynomial method.

A recent development in estimating nonlinear time series data is the spline-backfitted kernel
(SBK) method of Wang and Yang (2007). This method combines the computational speed of
splines with the asymptotic properties of kernel smoothing. To estimate a component function
in the model, all other component functions are “pre-estimated” with splines, and the
difference between the observed time series and the pre-estimates is taken. This difference is then
used as pseudo-responses, to which kernel smoothing is applied to estimate the function of interest.
By constructing the estimates in this way, the method does not suffer from the “curse of
dimensionality”.


°Research project for Spring and Summer 2014 as part of the Research Training Group. Supervised by Joshua 
Patrick. 
In this paper, we adapt the SBK method to FCAR models. Section 2 discusses the SBK
methodology. Section 3 uses simulation results to show the oracle efficiency of the
method, and Section 4 applies the method to real-world data. We conclude with a discussion
in Section 5.


2 Methodology 


To motivate the estimation method for (1), we use the oracle smoothing idea of Linton (1997)
and Wang and Yang (2007). Suppose we want to estimate $m_\gamma(U_t)$ in (1). If the coefficient
functions $m_\alpha(U_t)$, $\alpha = 1, \ldots, p$, $\alpha \neq \gamma$, are known by “oracle,” then we can construct
$\{U_t, X_{t-\gamma}, Y_{\gamma,t}\}_{t=1}^{n}$, where

$$Y_{\gamma,t} = m_\gamma(U_t)\, X_{t-\gamma} + \sigma(\mathbf{X}_t)\,\varepsilon_t = X_t - \sum_{\alpha=1,\,\alpha\neq\gamma}^{p} m_\alpha(U_t)\, X_{t-\alpha},$$

from which we can estimate the only unknown function $m_\gamma(U_t)$. This oracle smoother removes
the “curse of dimensionality” since there is only one unknown function to estimate. In practice,
the coefficient functions $m_\alpha(U_t)$, $\alpha = 1, \ldots, p$, $\alpha \neq \gamma$, are not known and must be estimated.


For additive models, Linton (1997) used marginal integration kernel estimates to estimate the
functions and Wang and Yang (2007) used an undersmoothed spline procedure. We now adapt
the procedure of Wang and Yang (2007) to estimate the FCAR model.


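We assume the delay variable $U_t$ is distributed on the compact interval $[a, b]$. Denote
the knots as $a = \kappa_0 < \kappa_1 < \cdots < \kappa_N < \kappa_{N+1} = b$, where the number of interior knots
is $N \sim n^{2/5} \ln n$. The B-spline basis functions are determined on the $N + 1$ equally spaced
intervals of length $(b - a)(N + 1)^{-1}$. The basis functions are defined as

$$B_J(u) = \begin{cases} 1, & \kappa_J \le u < \kappa_{J+1}, \\ 0, & \text{otherwise}, \end{cases} \qquad J = 0, \ldots, N.$$

The pre-estimates are defined as

$$\hat{m}_\alpha(u) = \sum_{J=0}^{N} \hat{\lambda}_{(N+1)(\alpha-1)+J+1}\, B_J(u), \qquad \alpha = 1, \ldots, p,$$

where the coefficients $\{\hat{\lambda}_1, \ldots, \hat{\lambda}_{p(N+1)}\}$ are solutions to the least squares problem

$$\{\hat{\lambda}_1, \ldots, \hat{\lambda}_{p(N+1)}\} = \arg\min_{R^{p(N+1)}} \sum_{t=1}^{n} \left\{ X_t - \sum_{\alpha=1}^{p} \left( \sum_{J=0}^{N} \lambda_{(N+1)(\alpha-1)+J+1}\, B_J(U_t) \right) X_{t-\alpha} \right\}^2. \tag{2}$$

Let $\mathbf{X}_t = (X_1, \ldots, X_n)'$, $\mathbf{X}_\alpha = (X_{1-\alpha}, \ldots, X_{n-\alpha})'$, $\mathbf{U}_t = (U_1, \ldots, U_n)'$, $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_{p(N+1)})'$,

$$\mathbf{B} = \begin{pmatrix} B_0(U_1) & B_1(U_1) & \cdots & B_N(U_1) \\ B_0(U_2) & B_1(U_2) & \cdots & B_N(U_2) \\ \vdots & \vdots & & \vdots \\ B_0(U_n) & B_1(U_n) & \cdots & B_N(U_n) \end{pmatrix},$$

and $\mathbf{Z} = (\mathbf{B} \circ \mathbf{X}_1, \mathbf{B} \circ \mathbf{X}_2, \ldots, \mathbf{B} \circ \mathbf{X}_p)$, where $\circ$ denotes the Hadamard product and, inside $\mathbf{Z}$, $\mathbf{X}_\alpha$ stands for the
$n \times (N + 1)$ matrix with the vector $\mathbf{X}_\alpha$ in each column. In matrix notation, the least squares estimates
are

$$\hat{\boldsymbol{\lambda}} = (\mathbf{Z}'\mathbf{Z})^{-1} \mathbf{Z}' \mathbf{X}_t,$$

and the pre-estimates are

$$\hat{m}_\alpha(\mathbf{U}_t) = \mathbf{B} \left( \hat{\lambda}_{(N+1)(\alpha-1)+1}, \ldots, \hat{\lambda}_{\alpha(N+1)} \right)', \qquad \alpha = 1, \ldots, p.$$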
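The spline pre-estimation step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the knots are placed on the observed range of $U_t$, only observations with all required lags available are used, and `numpy.linalg.lstsq` is used in place of the explicit inverse so that empty spline bins (zero columns) do not cause a failure.

```python
import numpy as np

def constant_spline_basis(u, a, b, n_interior):
    """n x (N+1) matrix of indicators B_J(u) = 1{kappa_J <= u < kappa_{J+1}}."""
    u = np.asarray(u, dtype=float)
    n_bins = n_interior + 1
    # Bin index of each u on the equally spaced grid; the right endpoint b
    # is assigned to the last bin so that every row contains exactly one 1.
    idx = np.floor((u - a) / (b - a) * n_bins).astype(int)
    idx = np.clip(idx, 0, n_bins - 1)
    B = np.zeros((u.size, n_bins))
    B[np.arange(u.size), idx] = 1.0
    return B

def spline_pre_estimates(x, p, d, n_interior):
    """Least squares fit in the spirit of (2); returns m_hat evaluated at the observed U_t."""
    x = np.asarray(x, dtype=float)
    start = max(p, d)                       # first index with all lags available
    u = x[start - d: len(x) - d]            # U_t = X_{t-d}
    y = x[start:]                           # responses X_t
    lags = np.column_stack([x[start - a: len(x) - a] for a in range(1, p + 1)])
    B = constant_spline_basis(u, u.min(), u.max(), n_interior)
    # Z = (B o X_1, ..., B o X_p): block a is B with row t scaled by X_{t-a}
    Z = np.hstack([B * lags[:, [a]] for a in range(p)])
    lam, *_ = np.linalg.lstsq(Z, y, rcond=None)   # minimum-norm solution if rank deficient
    m_hat = B @ lam.reshape(p, -1).T        # column a holds m_hat_{a+1}(U_t)
    return m_hat, u, lags, y
```

The returned `m_hat`, `lags`, and `y` are what a subsequent kernel step would need to form pseudo-responses.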


We now define the “pseudo-responses” as

$$\hat{Y}_{\gamma,t} = X_t - \sum_{\alpha=1,\,\alpha\neq\gamma}^{p} \hat{m}_\alpha(U_t)\, X_{t-\alpha}, \qquad t = 1, \ldots, n.$$


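Define the vector of pseudo-responses as $\hat{\mathbf{Y}}_\gamma = (\hat{Y}_{\gamma,p+1}, \ldots, \hat{Y}_{\gamma,n})'$. The spline-backfitted kernel
(SBK) estimate for the coefficient function $m_\gamma(u)$ is

$$\hat{m}_{SBK,\gamma}(u) = (1, 0) \left( \mathbf{V}'\mathbf{W}\mathbf{V} \right)^{-1} \mathbf{V}'\mathbf{W} \hat{\mathbf{Y}}_\gamma, \tag{3}$$

where

$$\mathbf{V} = \begin{pmatrix} X_{p+1-\gamma} & X_{p+1-\gamma}(U_{p+1} - u) \\ \vdots & \vdots \\ X_{n-\gamma} & X_{n-\gamma}(U_n - u) \end{pmatrix}, \qquad \mathbf{W} = \mathrm{diag}\left\{ K_h(U_{p+1} - u), \ldots, K_h(U_n - u) \right\},$$

$$K_h(u) = \frac{15}{16} \left( 1 - \left( \frac{u}{h} \right)^2 \right)^2 \frac{1}{h}\, I\{ |u/h| \le 1 \},$$

$I\{x\}$ is an indicator equal to one if $x$ is true and zero otherwise, and $h$ is a bandwidth selected
by the rule-of-thumb criterion of Fan and Gijbels (1996). Likewise, we define the oracle kernel
smoother as

$$\tilde{m}_{0,\gamma}(u) = (1, 0) \left( \mathbf{V}'\mathbf{W}\mathbf{V} \right)^{-1} \mathbf{V}'\mathbf{W} \mathbf{Y}_\gamma, \tag{4}$$

with the oracle responses $Y_{\gamma,t}$ in place of the pseudo-responses.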
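The kernel step in (3) and (4) can be sketched as a weighted least squares fit at a single point. The sketch below is illustrative only: the function names are hypothetical, the quartic kernel is our reading of $K_h$, and the bandwidth `h` is passed in by hand because the rule-of-thumb selector of Fan and Gijbels (1996) is not implemented here.

```python
import numpy as np

def quartic_kernel(v):
    """K(v) = 15/16 * (1 - v^2)^2 on |v| <= 1 and zero elsewhere."""
    v = np.asarray(v, dtype=float)
    return np.where(np.abs(v) <= 1.0, 15.0 / 16.0 * (1.0 - v ** 2) ** 2, 0.0)

def pseudo_responses(y, m_hat, lags, gamma):
    """Y_hat_{gamma,t} = X_t - sum_{alpha != gamma} m_hat_alpha(U_t) X_{t-alpha}; gamma is 1-based."""
    keep = np.arange(lags.shape[1]) != (gamma - 1)
    return y - np.sum(m_hat[:, keep] * lags[:, keep], axis=1)

def local_linear_coef(u0, u, x_lag, y_gamma, h):
    """(1, 0) (V'WV)^{-1} V'W Y evaluated at the point u0."""
    w = quartic_kernel((u - u0) / h) / h              # diagonal of W
    V = np.column_stack([x_lag, x_lag * (u - u0)])    # rows (X_{t-gamma}, X_{t-gamma}(U_t - u0))
    VtW = V.T * w                                     # V'W without forming the diagonal matrix
    beta = np.linalg.solve(VtW @ V, VtW @ y_gamma)
    return beta[0]                                    # first component estimates m_gamma(u0)
```

Passing pseudo-responses gives an SBK-type estimate, while passing the (in practice unavailable) oracle responses gives the oracle smoother; the two calls differ only in `y_gamma`.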


3 Simulation Results 


In this section, we present two simulation studies of the finite-sample behavior of the SBK
estimators for the functional-coefficient autoregressive model. The data are generated from the
FCAR model

$$X_t = \sum_{\alpha=1}^{p} m_\alpha(U_t)\, X_{t-\alpha} + \sigma(\mathbf{X}_t)\,\varepsilon_t, \qquad t = 1, \ldots, n.$$

The coefficient functions are set to $m_\alpha(U_t) = \lambda_\alpha \sin(\omega \pi U_t)$ for $\alpha = 1, \ldots, p$, where
$U_t = X_{t-d}$ is a delayed variable with $d = p + 1$ and the predictor $X_t$ is generated from the
distribution $X_t \sim N(0, 1)$. We ran two sets of simulations, one with $p = 4$ and one with
$p = 10$. When $p = 4$, we set $d = 5$, $\lambda = (0.5, -0.5, 0.5, -0.5)'$, and $\omega = 4.5$. When $p = 10$,
we set $d = 11$, $\lambda = (0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5)'$, and $\omega = 1.5$. For all
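simulated models, the error term is $\varepsilon_t \sim N(0, 1)$ with

$$\sigma(\mathbf{X}_t) = 0.1 \left( \frac{\sqrt{p}}{2} \right) \frac{5 - \exp\left( \sum_{i=1}^{p} |X_{t-i}| / p \right)}{5 + \exp\left( \sum_{i=1}^{p} |X_{t-i}| / p \right)},$$

which ensures heteroscedasticity with variance $\sigma^2(\mathbf{X}_t)$ roughly proportional to the dimension $p$.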

To implement the SBK estimator, we choose the spline degree to be 0 to ensure that we get
undersmoothed pre-estimates. The number of knots $N_n$ for our spline estimator is

$$N_n = \min\left( c_1 n^{2/5} \ln n + c_2, \; \frac{n}{2d} \right),$$

where $c_1$ and $c_2$ are tuning constants. The choice of these constants makes little
difference for a large sample, thus we set $c_1 = c_2 = 1$. In addition, we want $N_n < n/(2d)$. This
ensures that the number of terms in the least squares problem (2) is no greater than
$n/2$, which is necessary when the sample size $n$ is moderate and the dimension $p$ is high. Both the SBK
estimator $\hat{m}_{SBK,\alpha}(u)$ and the oracle smoother $\tilde{m}_{0,\alpha}(u)$ are obtained by local linear regression
as defined in (3) and (4) with the rule-of-thumb bandwidth.
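The knot-selection rule can be written as a short helper. This is a sketch under one explicit assumption: both arguments of the minimum are rounded down to integers, which the text does not state. The function name `num_knots` is hypothetical.

```python
import math

def num_knots(n, d, c1=1.0, c2=1.0):
    """N_n = min(c1 * n^(2/5) * ln(n) + c2, n / (2d)), with integer parts taken (assumed)."""
    growth = math.floor(c1 * n ** 0.4 * math.log(n)) + c2   # undersmoothing rate
    cap = math.floor(n / (2 * d))                            # keeps the least squares problem small
    return int(min(growth, cap))
```

For example, with $n = 100$ and $d = 5$ the cap $n/(2d) = 10$ is the binding term, while for $n = 1500$ and $d = 5$ the growth term is smaller than the cap of 150.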

We ran 500 replications for sample sizes $n = 100, 500, 1000, 1500$. For both simulations,
we chose to estimate the $\alpha = 1$ and $\alpha = 4$ components and compared the fit of our SBK estimators
$\hat{m}_{SBK,\alpha}(u)$ to the oracle smoothers $\tilde{m}_{0,\alpha}(u)$ by the relative efficiency,

$$\mathrm{eff}_\alpha = \frac{\sum_{t=1}^{n} \left\{ \hat{m}_{SBK,\alpha}(U_t) - m_\alpha(U_t) \right\}^2}{\sum_{t=1}^{n} \left\{ \tilde{m}_{0,\alpha}(U_t) - m_\alpha(U_t) \right\}^2}.$$

Because the distribution of the relative efficiencies is skewed for small sample sizes, we report the
mode, median, and variance of the simulated efficiencies in Table 1.
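The relative efficiency and its summary statistics can be computed along the following lines. This is a sketch: the function names are hypothetical, the ratio follows the order in which the formula is written above (SBK errors over oracle errors), and the mode is approximated by the midpoint of the fullest histogram bin, which is an assumption since the text does not say how the mode was obtained.

```python
import numpy as np

def relative_efficiency(m_sbk, m_oracle, m_true):
    """Ratio of summed squared errors of the SBK and oracle estimates."""
    m_true = np.asarray(m_true, dtype=float)
    num = np.sum((np.asarray(m_sbk, dtype=float) - m_true) ** 2)
    den = np.sum((np.asarray(m_oracle, dtype=float) - m_true) ** 2)
    return num / den

def summarize_efficiencies(effs, bins=30):
    """Mode (histogram-based, assumed), median, and sample variance over replications."""
    effs = np.asarray(effs, dtype=float)
    counts, edges = np.histogram(effs, bins=bins)
    k = int(np.argmax(counts))
    mode = 0.5 * (edges[k] + edges[k + 1])      # midpoint of the fullest bin
    return {"mode": mode,
            "median": float(np.median(effs)),
            "variance": float(np.var(effs, ddof=1))}
```

Because the variance is not robust, a single extreme replication can dominate it, which is why the median and mode are reported alongside.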


                      eff_1                          eff_4
  p      n     mode   median  variance       mode   median  variance
  4    100    0.274    0.510     1.009      0.210    0.530     1.429
       500    0.722    0.776     0.634      0.524    0.730     0.905
      1000    0.787    0.872     0.328      0.735    0.861     0.538
      1500    0.857    0.890     0.365      0.710    0.831     1.172
 10    100    0.155    0.362     4.920      0.173    0.349    10.22
       500    0.180    0.378     1.491      0.186    0.338     0.419
      1000    0.328    0.500     1.555      0.236    0.408     1.150
      1500    0.347    0.644     1.778      0.287    0.502     2.987

Table 1: Relative efficiency between $\hat{m}_{SBK,\alpha}(u)$ and $\tilde{m}_{0,\alpha}(u)$.


For both dimensions, the relative efficiencies move toward 1 as the sample size increases.
As expected, however, the convergence is slower in the high-dimensional case ($p = 10$) than in the
low-dimensional case ($p = 4$). We also expect the variance of the relative efficiencies to decrease as
the sample size increases. This appears to hold in the low-dimensional case, except that the variance for
$\alpha = 4$ at $n = 1500$ jumps back up by a large amount. The reason is that one of the values is 18.11, which
pulls up the variance; if this value is removed, the variance is 0.589. Moreover, the efficiencies are not
stable in the high-dimensional case: the variance for $\alpha = 4$ drops from 10.22 to 0.419 when the sample size
increases from 100 to 500 and then rises to 2.987 when the sample size is 1500.
Figure 1: Estimated distributions of relative efficiency between $\hat{m}_{SBK,\alpha}(u)$ and $\tilde{m}_{0,\alpha}(u)$. Panel titles: "Efficiency density for p = 4, alpha = 1" and "Efficiency density for p = 4, alpha = 4".

In Figure 1, the densities of the relative efficiencies for $n = 100, 500, 1000, 1500$ and
$p = 4, 10$ are presented. The relative efficiencies converge toward 1 in both the low- and
high-dimensional cases; however, the convergence rate is slower in high dimensions than in low
dimensions.


4 Application 


In this section, we apply our method to the Australian quarterly GDP data obtained
from http://stats.oecd.org. The data set contains 217 quarterly Australian GDP indices from Q1
of 1960 to Q1 of 2014, as shown in Figure 2a. In economic studies of series such as GDP, it is
common to take the natural log of the data because it shows differences more clearly. To
make the series stationary in the mean, we first need to detrend the data. Using kernel smoothing
with a bandwidth of 30, we fit a trend line to the data, which can be seen in Figure 2b.

To make the series stationary in variance, we took the fourth difference because the data
are quarterly. The detrended time series and the differenced-detrended time series are shown
in Figures 2c and 2d.
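The preprocessing steps above can be sketched as follows. Several details are assumptions made for illustration and are not stated in the text: the trend is estimated with a Nadaraya-Watson smoother over the time index using a Gaussian kernel, the bandwidth of 30 is measured in quarters, and "fourth difference" is read as a lag-4 (seasonal) difference. The function names are hypothetical.

```python
import numpy as np

def kernel_trend(y, bandwidth):
    """Nadaraya-Watson smoother over the time index with a Gaussian kernel (assumed)."""
    y = np.asarray(y, dtype=float)
    t = np.arange(y.size)
    w = np.exp(-0.5 * ((t[:, None] - t[None, :]) / bandwidth) ** 2)
    return (w @ y) / w.sum(axis=1)          # weighted average of y around each time point

def preprocess(gdp, bandwidth=30, lag=4):
    """Log transform, remove the kernel trend, then take a lag-4 difference."""
    log_gdp = np.log(np.asarray(gdp, dtype=float))
    detrended = log_gdp - kernel_trend(log_gdp, bandwidth)
    return detrended[lag:] - detrended[:-lag]
```

With 217 quarterly observations, this reading of the differencing step leaves 213 values for model fitting.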
Figure 2: Time series plots for the data: (a) Quarterly GDP, (b) Log of Quarterly GDP, (c) Detrended, (d) Differenced-detrended.


Recall that for our FCAR model in (1), we need to choose the delay index $d$ of the delay variable
$U_t = X_{t-d}$ and the dimension $p$ of $\sum_{\alpha=1}^{p} m_\alpha(U_t) X_{t-\alpha}$. We therefore fitted the time series
with different combinations of $d$ and $p$, where $d \in \{1, 2, \ldots, 10\}$ and $p \in \{2, 3, \ldots, 10\}$, and chose the
combination with the minimum MSE as our estimation parameters.
The results are shown in Table 2. The minimum MSE occurs when $d = 7$ and $p = 2$.


 d/p         2         3         4         5         6         7         8         9        10
   1  0.000130  0.000638  0.000155  0.000165  0.000168  0.000163  0.000165  0.000168  0.000153
   2  0.000213  0.000152  0.000239  0.000188  0.000187  0.000182  0.000185  0.000187  0.000178
   3  0.000132  0.000151  0.000177  0.000184  0.000198  0.000171  0.000170  0.000181  0.000171
   4  0.000127  0.000150  0.000174  0.000194  0.000176  0.000185  0.000160  0.000157  0.000395
   5  0.000130  0.000150  0.000179  0.000185  0.000163  0.000149  0.000134  0.000187  0.000171
   6  0.000144  0.000144  0.000185  0.000184  0.000168  0.000202  0.000176  0.000165  0.000185
   7  0.000116  0.000126  0.000153  0.000160  0.000174  0.000183  0.000172  0.000173  0.000159
   8  0.000160  0.000135  0.000152  0.000165  0.000161  0.000171  0.000163  0.000154  0.000156
   9  0.000118  0.000138  0.000182  0.000163  0.000175  0.000178  0.000178  0.000162  0.000163
  10  0.000123  0.000130  0.000151  0.000155  0.000166  0.000171  0.000159  0.000167  0.000163

Table 2: MSE of SBK estimations with different combinations of d and p (rows: d, columns: p).
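The selection rule behind Table 2 amounts to a grid search over $(d, p)$. A minimal sketch is given below; `fit_mse` is a hypothetical user-supplied callable that fits the model for a given $(d, p)$ and returns its MSE, and is not something defined in the paper.

```python
import itertools

def select_delay_and_order(series, fit_mse, d_grid=range(1, 11), p_grid=range(2, 11)):
    """Evaluate fit_mse(series, d, p) on the grid and return the minimizing (d, p)."""
    scores = {(d, p): fit_mse(series, d, p)
              for d, p in itertools.product(d_grid, p_grid)}
    best = min(scores, key=scores.get)      # (d, p) pair with the smallest MSE
    return best, scores
```

Because the search uses in-sample MSE, it does not penalize larger values of $p$; a held-out or penalized criterion could be substituted through `fit_mse` without changing the function.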


The level plot in Figure 3 makes it easier to see how the MSE values are distributed.
The darker the color, the lower the MSE of the SBK estimation. In Figure 3a, even though there are
two extremely high MSE values, we can still see that low MSE values generally occur when p
is small. When we replace the extreme values with the average, the trend is clearer:
in Figure 3b, we can see that for this particular FCAR model, low MSE values tend to occur
when p is small and d is large.
Figure 3: MSE of SBK estimations with different combinations of d and p: (a) levels of MSE, (b) levels of MSE without extreme values.


Since $d = 7$ and $p = 2$ give the minimum MSE when estimating the FCAR model
with the SBK method, the FCAR model for this data is

$$X_t = m_1(X_{t-7})\, X_{t-1} + m_2(X_{t-7})\, X_{t-2} + \sigma(\mathbf{X}_t)\,\varepsilon_t.$$

For comparison with our SBK method, we also fitted the data with the ARIMA model of order 1,

$$X_t = c + \psi X_{t-1} + \varepsilon_t,$$

where $c = 0.0042$ and $\psi = 0.8776$.
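For reference, a first-order autoregression of this form can be fitted by ordinary least squares as in the sketch below. This is an illustration only: the text does not say which estimation routine produced the reported values, and a dedicated ARIMA routine (typically likelihood-based) may give slightly different estimates. The function name `fit_ar1` is hypothetical.

```python
import numpy as np

def fit_ar1(x):
    """OLS fit of X_t = c + psi * X_{t-1} + eps_t; returns (c, psi, in-sample MSE)."""
    x = np.asarray(x, dtype=float)
    design = np.column_stack([np.ones(x.size - 1), x[:-1]])   # intercept and first lag
    (c, psi), *_ = np.linalg.lstsq(design, x[1:], rcond=None)
    resid = x[1:] - design @ np.array([c, psi])
    return float(c), float(psi), float(np.mean(resid ** 2))
```

The in-sample MSE returned here is the same kind of quantity that is compared between the two fitted models.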


Figure 4: SBK on FCAR estimation and ARIMA estimation. Panel title: "GDP vs predicted".


The MSE values for both methods are small: 0.0001164 for SBK on the FCAR model and
0.0001104 for the ARIMA model. The two values are close, so our SBK estimation on the FCAR model
is a satisfactory fit. In Figure 4, we see that both methods fit the series well. The two
coefficient functions we obtained for our FCAR model are shown in Figure 5. These functions
may help professional economists interpret the data.



Figure 5: Functional Coefficients of our FCAR Model 


5 Conclusion 

In this paper we have discussed the spline-backfitted kernel (SBK) method for estimating
functional-coefficient autoregressive (FCAR) models. The method breaks a p-dimensional
problem into p univariate problems, reducing the “curse of dimensionality.” This is achieved by
first “pre-estimating” all component functions other than the function of interest with splines;
the difference between the observed time series and the sum of the pre-estimates is then used as
pseudo-responses for kernel smoothing to estimate the function of interest.

Our simulations indicate that this method is oracally efficient, like local linear estimation in one dimension.
Moreover, the procedure is fast. One hundred replications with order 10
for sample sizes n = 100, 500, 1000, 1500 took about 40 minutes on a MacBook Air with a 1.4
GHz Intel Core i5 processor and 4 GB RAM. The combination of fast computation and
asymptotic accuracy for high-dimensional regression is very appealing.